Finite State Automata Built on DNA

This paper describes a non-deterministic finite-state automaton based on DNA strands.
The automaton uses the massive parallel processing offered by the molecular approach to
computation and exhibits a number of advantages over traditional electronic implementations.
The device is used to analyze DNA molecules, i.e. to check whether they are described by
a specified regular expression. The presented ideas are confirmed by an experiment performed
in a genetic engineering laboratory.

1. Introduction


The paper concerns molecular computation, a new discipline which uses molecules
to perform calculations [1, 2]. The described method uses DNA (deoxyribonucleic
acid) molecules for building finite state automata. The double helix of DNA is
formed from two separate DNA strands, connected together (head-to-toe) by hydrogen
bonds. A DNA strand may be viewed as a chain of nucleotides. There are four
nucleotides: adenine, cytosine, guanine, and thymine, abbreviated to A, C, G, and T
respectively. Each strand has a natural orientation, denoted (according to chemical
convention) by its 5' and 3' ends. The hydrogen bonding is selective: A bonds with T,
and G bonds with C, so the pairs (A, T) and (G, C) are complementary. Two DNA strands
are complementary if they are built from complementary nucleotides. More information
about DNA and the basic operations (i.e. hybridization, denaturation, ligation, cutting,
PCR) from a computer scientist's point of view can be found in [3-5].

* Correspondence to: Robert Nowak, Warsaw University of Technology, ul. Nowowiejska 15/19,
00-665 Warsaw, Poland, e-mail: r.n.nowak(ag)elka.pw.edu.pl
Received 19 February 2008; Accepted 10 June 2008

R. Nowak, A. Plucienniczak

An alphabet is a finite nonempty set of symbols. A symbol over some alphabet Σ,
denoted in this paper by a small letter or digit, is represented by a sequence
of consecutive nucleotides of length n. The notation x̄ denotes the sequence
complementary to the sequence representing symbol x, e.g. if x is represented by
ATCCCA (sequences are written from the 5' to the 3' end), the complementary sequence
is 3'-TAGGGT-5', thus x̄ is TGGGAT. A string over a given alphabet is any finite
sequence of symbols (e.g. ε, a, b, aa, ab, aab are strings over Σ = {a, b}, where ε
represents the empty string). The length of a string R (the number of symbol
occurrences in R) is denoted by |R|, e.g. |aab| = 3, |ε| = 0. Strings are represented
by DNA strands and denoted in this paper by capital letters. For example, consider
the alphabet Σ = {a, b}, where the symbols are represented by ATCCCA and GGTCCT
respectively. The DNA strand for R = abb has the sequence ATCCCAGGTCCTGGTCCT.
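
The encoding described above can be illustrated with a short Python sketch. The helper
names reverse_complement and encode are hypothetical (they are not taken from the paper);
the nucleotide sequences assigned to the symbols a and b are the ones given in the text.

```python
# Watson-Crick pairing: A bonds with T, G bonds with C.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Symbol encodings from the text (all sequences are written 5' -> 3').
ENCODING = {"a": "ATCCCA", "b": "GGTCCT"}


def reverse_complement(seq):
    """Return the complementary strand, also written 5' -> 3'."""
    return "".join(COMPLEMENT[n] for n in reversed(seq))


def encode(string):
    """Concatenate the symbol sequences to obtain the strand for a string."""
    return "".join(ENCODING[symbol] for symbol in string)


print(reverse_complement("ATCCCA"))  # TGGGAT
print(encode("abb"))                 # ATCCCAGGTCCTGGTCCT
```

The complement is reversed because the two strands of a double helix run in opposite
directions, so reading the partner strand from its own 5' end reverses the order.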

A set of strings over a given alphabet is called a formal language. A formal
grammar is a precise description of a formal language. The regular languages are
the simplest class in the Chomsky hierarchy of formal grammars. A right-linear
grammar is a quadruple G = (N, T, P, q0), where N is a nonterminal alphabet, T is a
terminal alphabet, T ⊂ Σ, q0 is the starting symbol (axiom), q0 ∈ N, and P is the set
of production rules. Production rules conform to the pattern 1 → a2 or 1 → a, where
1 ∈ N, 2 ∈ N, and a ∈ T. In this work numbers denote nonterminal symbols and letters
denote terminal symbols. In practice, regular expressions are commonly used
to describe regular languages. For any regular expression an equivalent right-linear
grammar can be constructed. More information about languages and grammars can
be found in [6].

The decision whether a given string belongs to a given regular language is made
by a finite state automaton. A finite state automaton can be constructed [6] for any
regular language. If the length of the regular expression describing a given regular
language is |R|, then a simple algorithm (of linear time complexity on an electronic
computer) can construct a non-deterministic finite state automaton. Its number of
states (memory complexity) is O(|R|), and the time complexity of analyzing a string X
is O(|R|·|X|). The deterministic finite state automaton has a number of states that
depends exponentially on the length of the regular expression, so its memory
complexity is O(2^|R|), and the analysis of a string X takes O(|X|) steps. A sample
finite state automaton is depicted in Fig. 1.
Fig. 1. Finite state automaton and the production rules for the language a(a|b)*b

The automaton recognizes strings belonging to the language a(a|b)*b. The grammar
G = (N, T, P, q0), where N = {0, 1}, T = {a, b}, q0 = 0, and P is shown in Fig. 1,
generates strings belonging to this language.
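
As an illustration of the electronic non-deterministic approach, the following Python
sketch tracks the set of nonterminals that can be active after each input symbol. It
assumes that the productions of Fig. 1 are 0 → a1, 1 → a1, 1 → b1 and 1 → b, which is
what the reductions listed in section 3.3 imply; the names PRODUCTIONS, ACCEPT and
accepts are hypothetical.

```python
# Assumed productions for a(a|b)*b as (left side, terminal, right nonterminal).
# The right nonterminal is None for a terminating production such as 1 -> b.
PRODUCTIONS = [(0, "a", 1), (1, "a", 1), (1, "b", 1), (1, "b", None)]
AXIOM = 0
ACCEPT = "accept"  # hypothetical marker: the derivation has ended


def accepts(string):
    """Simulate the non-deterministic automaton derived from the grammar."""
    current = {AXIOM}
    for symbol in string:
        following = set()
        # Every production is tried for every symbol: O(|P| * |X|) work overall.
        for left, terminal, right in PRODUCTIONS:
            if left in current and terminal == symbol:
                following.add(ACCEPT if right is None else right)
        current = following
    return ACCEPT in current


print(accepts("abb"), accepts("aab"), accepts("bbb"))  # True True False
```

The inner loop over all productions is the source of the |R| factor in the O(|R|·|X|)
bound quoted above, since the number of productions grows with the length of the
regular expression.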

The idea described in this paper is to build a non-deterministic finite state
automaton in vitro. Such a device uses the massive parallelism offered by the
molecular approach, and has size complexity (understood as the number of different
molecules) O(|R|), where R is the regular expression describing the given language.
The analysis of a string X has O(|X|) time complexity.


2. Molecular Production 


A molecular production is a biological system which conditionally creates a designed
DNA strand. It is the basic element of a molecular automaton, used to implement
a transition.

The molecular production, denoted A → B, creates the string XB if and only if the
input is XA (A, B, X are sub-strings, |A| > 0). Such a system checks whether the
sequence of nucleotides representing the condition (here A) is present at the 3' end
of the input string, and then creates the output string: the DNA strand is copied
from the input, but the condition sequence is replaced by the sequence representing
the result of the production (here B). Therefore, for the input XA, the string XB is
obtained (Fig. 2). It should be mentioned that XA is also present in the output,
because the input and the output are not separated.

If the strand representing the input string does not have the condition sequence at
the 3' end, the molecular production creates nothing. For example, the production
A → B for the input XC, where C ≠ A, provides only XC (Fig. 2).


Input string XA  ->  molecular production A → B  ->  output string XB
Input string XC  ->  molecular production A → B  ->  output string XC


Fig. 2. Molecular production A → B; X, A, B, C are strings, and |A| > 0, A ≠ C
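
The behaviour of a single molecular production can be modelled abstractly as a rewrite
of the suffix of a string. The following Python sketch is only such a model, assuming
that strands are represented as ordinary strings and the vessel as a set; the function
name molecular_production is hypothetical.

```python
def molecular_production(condition, result, vessel):
    """Apply the production condition -> result to every strand in the vessel.

    The returned vessel still contains the input strands, because the input
    and the output of a molecular production are not separated.
    """
    if not condition:
        raise ValueError("the condition A must be non-empty (|A| > 0)")
    output = set(vessel)
    for strand in vessel:
        # The condition must be present at the 3' end, i.e. as a suffix.
        if strand.endswith(condition):
            output.add(strand[: len(strand) - len(condition)] + result)
    return output


print(sorted(molecular_production("A", "B", {"XA"})))  # ['XA', 'XB']
print(sorted(molecular_production("A", "B", {"XC"})))  # ['XC']
```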
3. Molecular Finite State Automaton


3.1. Reductions for Right-linear Grammars 


Theorem:

If a string S belongs to the language generated by the right-linear grammar
$G = (N, T, P, q_0)$, then it can be reduced to the string $q_0$ (axiom). The following
reduction rules are used for the last symbols in a string:

• $a \to 0$ when $0 \to a \in P$,

• $a1 \to 0$ when $0 \to a1 \in P$.


Lemma:
When a string is generated from the axiom ($q_0$) by the right-linear grammar, it has
at most one non-terminal symbol. This is the last symbol of the string.


Proof of lemma (mathematical induction):
Assume that $S_i = w_1 w_2 \ldots w_i A_i$, where $w_j \in T$, $A_i \in N$. $S_{i+1}$ is
obtained from $S_i$ by a production $A_i \to w_{i+1} A_{i+1}$ (which retains the
condition) or by a production $A_i \to w_{i+1}$ (which also retains it).

The generated strings are:
$q_0 \to w_1 A_1 \to w_1 w_2 A_2 \to \ldots \to w_1 w_2 \ldots w_{n-1} A_{n-1} \to w_1 w_2 \ldots w_n$,
where $w_i \in T$, $A_i \in N$.


Proof of theorem:
The string $S_n = w_1 w_2 \ldots w_n$ can be generated by the right-linear grammar
$G = (N, T, P, q_0)$ only from strings $S_{n-1} = w_1 w_2 \ldots w_{n-1} A_{n-1}$, where
$A_{n-1} \to w_n \in P$. Therefore the strings $S_{n-1}$ can be constructed from $S_n$ by
the reductions $w_n \to A_{n-1}$. If $P$ does not contain a production rule
$A_{n-1} \to w_n$, then the string $w_1 w_2 \ldots w_n$ does not belong to the language
generated by $G$.

The string $S_n = w_1 w_2 \ldots w_n A_n$ can be created only from strings
$S_{n-1} = w_1 w_2 \ldots w_{n-1} A_{n-1}$, where $A_{n-1} \to w_n A_n \in P$. If no
production rule $A_{n-1} \to w_n A_n$ is in $P$, then the string
$S_n = w_1 w_2 \ldots w_n A_n$ does not belong to the language generated by $G$. The
strings $S_{n-1}$ are constructed from $S_n$ by the reductions $w_n A_n \to A_{n-1}$.
Because $|S_{n-1}| = |S_n| - 1$, a string of length $n$ is reduced to a set of strings
of length 1 after $n-1$ steps. If the axiom ($q_0$) is in this set, the string
$w_1 w_2 \ldots w_n A_n$ can be generated by $G$.


3.2. Molecular Automaton Based on Reductions 


For a given language the corresponding right-linear grammar is constructed. To
analyze the input string, the reductions (described in the theorem) are performed.
If the axiom (starting symbol) is obtained, the input string belongs to the language
generated by the grammar (is accepted).

This idea is the core of the molecular automaton. The device takes advantage of
molecular productions to implement reductions. The algorithm is shown in Fig. 3.


[Figure: flowchart with the steps "molecular productions" and "separation", followed by a
final test with the outcomes TRUE and FALSE]


Fig. 3. Molecular automaton: algorithm


Firstly, the automaton is prepared and the DNA strand representing the input string
is added to a vessel.

Then k steps (where k = |S|) of productions and separations are performed. In
each production the last symbols of the string can be reduced. The separation
removes the input string (present in the output of the molecular production), i.e. for
the input XA and the molecular production A → B only XB is kept. It should be noted
that the molecular productions work independently of each other, so in a single
step many different reductions can be performed.

Finally, the axiom is detected. If such a string is obtained, the answer is true and
the input string is accepted by the automaton. Otherwise, the answer is false.


3.3. Example 


Consider the regular language a(a|b)*b and the right-linear grammar shown in Fig. 1.
The reductions (molecular productions) for this language are: b → 1, b1 → 1, a1 → 1
and a1 → 0.

The molecular automaton performs 3 steps when the input string abb is analyzed:
abb → ab1 → a1 → {0, 1}. The axiom (symbol 0) is present, thus abb belongs to the
given language.

For bbb and the same automaton the reductions are: bbb → bb1 → b1 → 1.
The axiom is not present, so bbb is rejected.
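
The algorithm of section 3.2 and the two traces above can be reproduced with a small
Python model. It is a sketch under simplifying assumptions: every symbol is one
character, the vessel is a set of strings, and the separation is modelled by keeping
only the newly produced strings. The names step, trace and accepts are hypothetical.

```python
# Reductions for a(a|b)*b from the text, as (condition, result) pairs.
REDUCTIONS = [("b", "1"), ("b1", "1"), ("a1", "1"), ("a1", "0")]
AXIOM = "0"


def step(vessel, reductions=REDUCTIONS):
    """One round of molecular productions followed by separation."""
    produced = set()
    for strand in vessel:
        # All productions act on the vessel independently (in parallel in vitro).
        for condition, result in reductions:
            if strand.endswith(condition):
                produced.add(strand[: len(strand) - len(condition)] + result)
    return produced  # separation: the input strands are discarded


def trace(string, reductions=REDUCTIONS):
    """Return the vessel contents before the first step and after each step."""
    vessels = [{string}]
    for _ in range(len(string)):  # k = |S| steps
        vessels.append(step(vessels[-1], reductions))
    return vessels


def accepts(string, reductions=REDUCTIONS, axiom=AXIOM):
    """Detection: the string is accepted if the axiom is in the final vessel."""
    return axiom in trace(string, reductions)[-1]


print([sorted(v) for v in trace("abb")])  # [['abb'], ['ab1'], ['a1'], ['0', '1']]
print(accepts("abb"), accepts("bbb"))     # True False
```

In this model the loop over reductions is sequential, whereas in the vessel all
production engines act at the same time, which is why the molecular automaton needs
only |S| steps.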
4. Molecular Production: Realization


The molecular production is the basic element used to build the automaton considered 
in this paper. Below the details of this process are presented. 
Fig. 4. Molecular production A → B. The production engine has the sequence ĀB


The process of the molecular production (illustrated in Fig. 4) needs a DNA
strand called a production engine. This strand contains two parts: the first is
complementary to the conditional part of the molecular production (i.e. Ā for A → B),
the second has the nucleotides representing the result of the production (i.e. B for
A → B).

Firstly, the production engine partially hybridizes to the input string (only if
the input has the proper sequence at its 3' end). Next, a special polymerization with
the DNA polymerase “jump” is performed. The strand built by the polymerase has a
sequence complementary to the output string. Finally, after denaturation, an ordinary
polymerization is performed, so the output string is produced. It should be noted
that if the hybridization does not occur (the production engine and the input string
are not partially complementary), the polymerase produces only a strand complementary
to the production engine, which can be easily removed.

The probability of the DNA polymerase “jump” is very small, so PCR is applied in the
experiments. PCR (and dilution) can also remove the strands representing input
strings (the separation in Fig. 3).
5. Molecular Automaton Example


The molecular automaton (described in section 3) is represented in the vessel by a set
of production engines. For the language a(a|b)*b these molecules are depicted in Fig. 5.
For the considered language there are 4 productions; the sequences correspond to the
reductions a1 → 0, a1 → 1, b1 → 1 and b → 1 respectively.


[Figure: four production engines, one per reduction, each drawn from the 5' to the 3' end
and flanked by end sequences]


Fig. 5. The molecular automaton (molecular engines) for the language a(a|b)*b


The DNA strand representing a model string aab is shown in Fig. 6. As denoted 
previously there are sequences representing symbols, and some temporary ones, 
denoted start, end, used as primers. 


5' start a b b end 3'


Fig. 6. DNA strand representing a model word abb


At the beginning of the calculations the molecules implementing the automaton
(production engines) and the input string are put into the vessel. Next the molecular
productions are performed (firstly the hybridization, next the polymerization with
“jump”, then denaturation and finally polymerization, as in section 4). Because the
string aab has length 3, three steps of molecular production and separation are
performed.

The first step is schematically depicted in Fig. 7. The production engine
representing the reduction b → 1 bonds to the strand working as the input string (the
reduction b → 1 is said to be active). The other engines do not bond, so after the
molecular production and separation the vessel contains the molecule corresponding to
the string aa1.

In the second step, depicted in Fig. 8, the reduction a1 → 1 is active, and the
string a1 is obtained.

The third step (Fig. 9) shows the parallelism of the described approach. Two
reductions are active: a1 → 0 and a1 → 1. So two different molecules are created,
representing the strings 0 and 1 respectively.
[Figure: the input strand 5' start a 1 end 3' and a produced strand 5' start 0 end 3']


Fig. 9. Examining the string aab (for the language a(a|b)*b). Step 3


Finally, the detection is performed. It is done by checking whether the molecule
representing the axiom (here the string 0) is present in the vessel. Because such a
DNA strand is produced in the third step, the answer is true. The string aab is
accepted by the molecular automaton.


6. Experimental Verification

The experiments carried out in a genetic engineering laboratory confirmed the presented
ideas. The process of the molecular production was performed, and the possibility of
building molecular automata from many such productions was verified. The automata
built in the laboratory were very simple, as depicted in Fig. 10.


Fig. 10. The finite automata built in the laboratory
6.1. Verification of Molecular Production

The experiment was performed as described in section 4; the details are given below.
The single-stranded DNA molecule from the M13 phage, denoted m13 (GenBank:
www.ncbi.nlm.nih.gov; No X02513, produced by Amersham), was used as the
input string. The production engine, called m13HAK, was a synthetic molecule with
the sequence given in Table 1.


Table 1. Sequences of DNA molecules used in the experiment


Name      Length   Sequence
m13       7249     M13mp18, GenBank
m13PKO    21       CTA GCA CTA CAA CTC GGA CTA
m13HAK    55       GAG GTC ATT TTT GCG GAT GGC TTA GAG CTT CCG
                   GTA GTC CGA GTT GTA GTG CTA G
m13P      22       CTA TTA GTA GAA TTG ATG CCA C


The hybridization was performed on the mixture shown in Table 2, which was
heated to 95°C (denaturation of all molecules) and kept at 72°C for 1 minute. As a
result, the structure illustrated in Fig. 11 was created. The molecule m13HAK annealed
to m13 (their sequences are partly complementary) and m13PKO (used as a primer)
hybridized to m13HAK.
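
The statement that the primer m13PKO hybridizes to m13HAK can be checked directly on
the sequences of Table 1. The Python sketch below is only a consistency check on the
published sequences, not part of the laboratory procedure; the helper names
reverse_complement and anneals_to_3prime_end are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Sequences copied from Table 1 (written 5' -> 3'); the spaces are removed.
M13PKO = "CTA GCA CTA CAA CTC GGA CTA".replace(" ", "")
M13HAK = (
    "GAG GTC ATT TTT GCG GAT GGC TTA GAG CTT CCG "
    "GTA GTC CGA GTT GTA GTG CTA G"
).replace(" ", "")


def reverse_complement(seq):
    """Return the complementary strand, written 5' -> 3'."""
    return "".join(COMPLEMENT[n] for n in reversed(seq))


def anneals_to_3prime_end(primer, template):
    """True if the primer is fully complementary to the 3' end of the template."""
    return template.endswith(reverse_complement(primer))


print(len(M13PKO), len(M13HAK))               # 21 55
print(anneals_to_3prime_end(M13PKO, M13HAK))  # True
```

The lengths agree with the values listed in Table 1, and the primer is complementary
to the last 21 nucleotides of the production engine.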


Table 2. Mixture used for hybridization in the verification of the molecular production


Name      Concentration   Amount
m13       0.44 pM/ul      5 ul
m13HAK    5.0 pM/ul       1 ul
m13PKO    5.0 pM/ul       1 ul
Buffer    10x             4 ul
H2O                       23 ul
TOTAL                     34 ul
Fig. 11. DNA strands created after hybridization

Next the polymerization with the DNA polymerase “jump” was performed. To the mixture
kept at 50°C, 5 ul of dNTP (20 uM/ul, Amersham) and 1 ul of Taq DNA polymerase
(2 u/ul, from Thermus aquaticus, produced by the Institute of Biotechnology and
Antibiotics) were added.

After 20 minutes the mixture was treated with phenol and a chloroform-isoamyl
alcohol mixture, and amplified by PCR with the primers m13P and m13PKO (26 and 29
cycles: 94°C 15 sec, 50°C 15 sec, 72°C 30 sec). 10 ul of the PCR products were
electrophoresed on 6% polyacrylamide gels (acrylamide:bisacrylamide = 59:1) with TAE
buffer and stained in ethidium bromide at 0.5 ug/L aqueous solution for 10 min. The
image of the gel (Fig. 12) was made using a White/Ultraviolet Transilluminator.
Fig. 12. Gel image after the DNA polymerase “jump” reaction: strips 1, 2 and 8: products of PCR (DNA
before cutting); strips 3, 5 and 9: pattern (1444, 736, 587, 476, 458, 434, 298, 267, 257, 174, 102, 80, 30);
strip 4: HinfI; strip 6: HpaII; strip 7: RsaI


The molecule had the proper length; however, an additional verification was performed.
The products of PCR were cut by the restriction enzymes RsaI, HinfI and HpaII
(Amersham). Strips 4, 6 and 7 in Fig. 12 confirmed that the molecules had the proper
sequences.

The estimated probability of the DNA polymerase “jump” is about 3·10^-8. This
value was obtained by comparing the brightness of the strips.


6.2. Verification of Simple Molecular Automaton 


The two-state automaton depicted in Fig. 10, containing two molecular productions,
a → b and b → c, was tested in practice in the laboratory. In the experiment the input
string xa was transformed into xb and then into xc. The simplified schema of this
experiment is shown in Fig. 13. The idea was described in section 3 and section 5;
the details are given below.

The sequences of the DNA molecules used in this experiment are shown in Table 3.
The molecule called m13 represented the input string (xa), the molecule aut
represented the string xb, the production engine for a → b was m13HAK2, the production
engine for b → c was m13HAK3, the molecule m13Z was a support for the molecular
production, and the others (m13PKP, m13P, m13PKP2) were used as primers.


Fig. 13. Experiment for the finite state automaton: a) the molecular production a → b constructs the
string xb, where xa is the input string, b) the second production b → c makes xc from xb


Table 3. Sequences of DNA molecules used in the experiment


Name       Length   Sequence
m13        7249     M13mp18, GenBank
m13HAK2    60       ATC TGG TGC TGT AGC TCA ACA TGT TCC GGA
                    GCA CCA GAT ATC TTC GAG TTG TAG TGC TAG
m13Z       25       TGC TCC GGT TAA ATA TGC AAC TAA A
m13PKP     21       CTA CGA CTA CAA CTC GGA GAT
m13P       22       CTA TTA GTA GAA TTG ATG CCA C
aut        218      the string created by the molecular production a → b
m13HAK3    56       CGA AGA TAT CTG GTG CTC CGG CCG GAG CAC
                    GAG AAT TCC ATA GGA CCT TGC GCT CC
m13PKP2    21       GGA GCG CAA GGT CCT ATG GAA

Firstly, the molecules m13Z were phosphorylated (T4 kinase) at 37°C for 1 hour, then
the enzyme was thermally deactivated. Secondly, the molecular production a → b was
performed; next, the production b → c was carried out; finally, the molecule
representing the output string was checked. This process was described in section 3
from the computer science point of view.

The molecular production a → b was performed in a different way than described
in section 4. The respective steps were:

1. Hybridization: the mixture shown in Table 4 was heated to 95°C for 1 minute
(denaturation of all molecules) and kept at 72°C for 1 minute. The structure illustrated
in Fig. 14 was created. The molecule m13Z annealed to both DNA strands, which
eliminated the DNA polymerase “jump” and made the molecular production much more
efficient.

2. Polymerization: 0.5 ul of Klenow polymerase (5 u/ul) was added to the vessel
containing the products of hybridization, and the polymerization was performed
at 20°C for 2 minutes. There were two primers: m13PKP and m13Z (Table 4, Fig. 14).
The enzyme was thermally deactivated (the mixture was heated to 65°C).

3. Ligation: 10 ul of the mixture after polymerization, 1 ul of ligase enzyme
(Amersham) and ligase buffer were kept at 16°C for 24 hours.

4. PCR: the products of ligation were amplified by PCR with the primers m13P and
m13PKP (cycles: 94°C 15 sec, 50°C 15 sec, 72°C 30 sec).

5. Check: 10 ul of the PCR products were electrophoresed on 6% polyacrylamide
gels (acrylamide:bisacrylamide = 59:1) with TAE buffer and stained in ethidium
bromide at 0.5 ug/L aqueous solution for 10 min. The image of the gel (Fig. 15), made
using a White/Ultraviolet Transilluminator, confirmed the proper length of the aut
molecule.

6. Check: 10 ul of the PCR products were treated with phenol and a chloroform-isoamyl
alcohol mixture, and were cut by the restriction enzymes EcoRV, HinfI and MspI
(Amersham) to prove that the molecules had the proper sequences. The gel image is
shown in Fig. 16.

It should be mentioned that PCR (step 4) performed the separation (Fig. 3). This
process produced the aut molecule of length 218 bp and concentration 0.5 pM/ul
(10 ng/ul).


Table 4. Mixture for the verification of the molecular production a → b


Name       Concentration   Amount
m13        0.1 pM/ul       2.0 ul
m13HAK2    6 pM/ul         1.0 ul
m13Z       5 pM/ul         1.5 ul
m13PKP     8 pM/ul         1.0 ul
dNTP       20 uM/ul        2.0 ul
Buffer     10x             2.0 ul
H2O                        10.0 ul
TOTAL                      19.5 ul


Fig. 14. DNA molecules used for the verification of the molecular production a → b

Fig. 15. Molecules after the a → b molecular production: strips 1 and 7: PCR after 23 cycles; strips 2, 5, 6
and 9: pattern (1444, 736, 587, 476, 458, 434, 298, 267, 257, 174, 102, 80, 30); strip 3: PCR 19 cycles;
strip 4: PCR 15 cycles; strip 8: PCR 24 cycles

Fig. 16. Gel image after cutting the aut molecule; strip 1: MspI, strip 2: HinfI, strip 3: EcoRV, strip 4:
pattern (1444, 736, 587, 476, 458, 434, 298, 267, 257, 174, 102, 80, 30), strip 5: DNA before cutting


The molecular production b → c was performed similarly. The PCR products of the
production a → b, containing the aut molecule, were used in hybridization,
polymerization, ligation and amplification, and then the molecule representing the
string xc was obtained; the details are given below.

The hybridization mixture (Table 5) was heated to 95°C and kept at 72°C.
The structure shown in Fig. 17 was created. Next, polymerization and ligation were
performed. Finally, the molecules were amplified by PCR with the primers m13P and
m13PKP2, electrophoresed and photographed (Fig. 18). The products of PCR were
cut by restriction enzymes to prove that they had the proper sequences.

The experiment showed that many consecutive molecular productions can perform
computations; thus it was demonstrated that the molecular automata described in this
work can be built and used in practice.


Table 5. Mixture for the verification of the molecular production b → c


Name       Concentration   Amount
aut        0.5 pM/ul       4 ul
m13HAK3    10 pM/ul        2 ul
m13Z       5 pM/ul         3 ul
m13PKP2    10 pM/ul        2 ul
dNTP       20 uM/ul        2 ul
Buffer     10x             3 ul
H2O                        13 ul
TOTAL                      29 ul


5' CTATTAGTAGAAT ... TTTAGTTGCATATTTAA---CCGGAGCACCAGATATCTCGAGTTGTAGTGCTAG
AAATCAACGTATAAATT c GGCCTCGTGGTCTATAGAAGC

Fig. 17. Structure after hybridization in the reaction verifying the molecular production b → c


Fig. 18. Gel image after the production b → c: strip 1: pattern (1444, 736, 501, 489, 476, 404, 331, 242,
190, 147, 111, 110); strip 2: PCR after 23 cycles; strip 3: PCR after 26 cycles; strip 4: PCR after 29 cycles


7. Conclusion 


A short comparison of the complexity of different finite state automata is presented
in Table 6. The molecular approach has advantages over electronic implementations,
because all possible transitions from a given state are considered simultaneously
(taking advantage of massive parallel processing).
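
The exponential size of the deterministic automaton listed in Table 6 can be
illustrated with the subset construction. The Python sketch below uses the family of
languages (a|b)*a(a|b)^k, a standard textbook example that is not taken from this
paper; the function names nfa_suffix and count_dfa_states are hypothetical.

```python
from collections import deque


def nfa_suffix(k):
    """Transitions of an NFA with k + 2 states for (a|b)*a(a|b)^k.

    State 0 loops on both symbols and guesses the marked 'a';
    states 1..k then consume the remaining k symbols.
    """
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, k + 1):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta


def count_dfa_states(delta, start=0, alphabet="ab"):
    """Count the subsets reachable in the subset construction."""
    start_set = frozenset({start})
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        subset = queue.popleft()
        for symbol in alphabet:
            target = frozenset(
                t for s in subset for t in delta.get((s, symbol), ())
            )
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return len(seen)


for k in range(5):
    print(k + 2, count_dfa_states(nfa_suffix(k)))  # NFA states vs DFA states
```

For this family the non-deterministic automaton grows linearly with k, while the
number of reachable deterministic states doubles with every additional symbol.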

Table 6. Complexity of finite state automata described by the regular expression r when the string S
is analyzed. Size for the molecular automaton is the number of different molecules used for the
calculation


Automaton                     Size        Time
Electronic nondeterministic   O(|r|)      O(|r|·|S|)
Electronic deterministic      O(2^|r|)    O(|S|)
Molecular                     O(|r|)      O(|S|)


There are a few other works describing the realization of automata using the
molecular approach. In [2] only propositions are given; in [7] human assistance is
needed. The method described there employs a single vessel to code many states
(because it implements a non-deterministic automaton), and a person reads the
current symbol from the input string and decides to which vessel the molecules
should be added (simulating a transition). This complicates the experiments and makes
the process slower, more expensive and much more prone to errors. The interesting idea
shown in [8], which uses the FokI enzyme, was experimentally proved. The main
disadvantage of that method is the small maximum number of states and transitions
(256).

The presented non-deterministic finite state automaton can be treated as an
alternative way of performing molecular computation. It is a step toward constructing
a molecular computer.

In practice, it might be used in biological and medical research for searching for
DNA sequences described by a regular expression. When the requested sequence
is simple (can be described by a regular expression), the described non-deterministic
automaton performs this task quickly and inexpensively (compared with currently used
DNA sequencing), so that e.g. the diagnosis of a genetic disease can be performed on
a large scale.